Short text clustering algorithm based on weighted kernel nonnegative matrix factorization
CAO Dawei, HE Chaobo, CHEN Qimai, LIU Hai
Journal of Computer Applications    2018, 38 (8): 2180-2184.   DOI: 10.11772/j.issn.1001-9081.2018020356
Abstract
Clustering the large volume of short texts generated on the Internet has great application value. Because short texts have sparse features that are difficult to extract, traditional text clustering algorithms face many challenges when applied to them. To address this problem, a short text clustering algorithm based on Weighted Kernel Nonnegative Matrix Factorization (WKNMF) was proposed, built on the Nonnegative Matrix Factorization (NMF) model. To make full use of the hidden semantic features of short texts, the sparse feature space was mapped to high-dimensional implicit vectors by a kernel method. In addition, the kernel trick was used to simplify the complex operations on high-dimensional data, and the weight vectors of the short texts were adjusted dynamically through iterative optimization update rules, so that the importance of different short texts to the clustering could be distinguished. Experiments were conducted on real microblog data sets, and WKNMF was compared with K-means, Latent Dirichlet Allocation (LDA), NMF and Self-Organizing Map (SOM). The experimental results show that the proposed WKNMF algorithm achieves better clustering performance than the compared algorithms, with accuracy and Normalized Mutual Information (NMI) reaching 66.38% and 66.91%, respectively.
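To illustrate the kernel-trick idea mentioned in the abstract, the following is a minimal sketch of plain (unweighted) kernel NMF in Python with NumPy. It is not the authors' WKNMF: the abstract does not give the weight update rule, so the per-text weighting step is omitted. The RBF kernel, the function names `rbf_kernel`, `kernel_nmf` and `kernel_loss`, and the parameters `gamma`, `n_iter` and `eps` are all assumptions made for illustration. The sketch factorizes the implicit feature matrix as φ(X) ≈ φ(X)·W·H using only the Gram matrix K, with standard multiplicative updates that assume K has nonnegative entries.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gram matrix of an RBF kernel; rows of X are documents (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clip tiny negative distances caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_nmf(K, k, n_iter=200, eps=1e-10, seed=0):
    """Unweighted kernel NMF: phi(X) ~ phi(X) W H, with W (n x k) and H (k x n) >= 0.

    Only the Gram matrix K is used, so the high-dimensional feature map is never
    formed explicitly. Assumes K is symmetric with nonnegative entries.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    W = rng.random((n, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        KW = K @ W
        # H <- H * (W^T K) / (W^T K W H); W^T K equals (K W)^T because K is symmetric.
        H *= KW.T / (W.T @ KW @ H + eps)
        # W <- W * (K H^T) / (K W H H^T)
        W *= (K @ H.T) / (KW @ (H @ H.T) + eps)
    return W, H

def kernel_loss(K, W, H):
    """Squared reconstruction error ||phi(X) - phi(X) W H||_F^2, expressed through K."""
    KWH = K @ W @ H
    return np.trace(K) - 2.0 * np.trace(KWH) + np.trace(H.T @ W.T @ KWH)
```

Under these assumptions, a cluster label for each text can be read off as `H.argmax(axis=0)`, that is, the factor with the largest coefficient in each column of `H`. A weighted variant in the spirit of WKNMF would additionally scale each text's contribution to the loss, which this sketch does not attempt.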